Abstract. Such modern applications of topology as digital image analysis
and data analysis have to deal with noise and other uncertainty. In this
environment, the data structures often appear "filtered" into a sequence of cell
complexes. We introduce the homology group of the filtration as the group of
all possible homology classes of all elements of the filtration without double
counting. The second step of analysis is to discard the features that lie outside
the user's choice of the acceptable level of noise.



1. Introduction 

Since Poincaré, homology has been used as the main descriptor of the topology
of geometric objects. In the classical context, however, all homology classes receive
equal attention. Meanwhile, applications of topology in analysis of images and data
have to deal with noise and other uncertainty. This uncertainty usually appears
in the form of a real-valued function defined on the topological space. Persistence
is a measure of robustness of the homology classes of the lower level sets of this
function [6], [2], [1], [3].

Since it's unknown beforehand what is or is not noise in the dataset, we need to
capture all homology classes, including those that may be deemed noise later. In this
paper we introduce an algebraic structure that contains, without duplication, all
these classes. Each of them is associated with its persistence and can be removed
when the acceptable threshold for noise is set. The last step can be carried out
repeatedly in order to find the best possible threshold. The method follows the
approach to analysis of digital images presented in [8].

2. Background

The topological spaces subject to such analysis are cell complexes. A cell
complex is a combinatorial structure that describes how $k$-dimensional cells are
attached to each other along $(k-1)$-dimensional cells. Cell complexes come from
the following two main sources.

First, a gray scale image is a real-valued function $f$ defined on a rectangle.
Given a threshold $r$, the lower level set $f^{-1}((-\infty, r))$ can be thought of as a binary
image. Each black pixel of this image is treated as a square cell in the plane. These
2-dimensional cells are combined with their edges (1-cells) and vertices (0-cells). In
the $n$-dimensional case, the image is decomposed into a combination of 0-, 1-, ...,
$n$-cubes. This process is called thresholding. The result is a cell complex $K$ for
each $r$, see [3].
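The thresholding step can be sketched in a few lines of Python. This is only a minimal illustration, assuming the image is stored as a list of rows of gray levels; the function names and the sample image are hypothetical and not taken from [8].

```python
def lower_level_set(image, r):
    """Binary image f^{-1}((-inf, r)): True marks a 'black' pixel (gray level < r)."""
    return [[value < r for value in row] for row in image]


def threshold_filtration(image, thresholds):
    """One binary image per threshold, ordered by increasing r."""
    return [lower_level_set(image, r) for r in sorted(thresholds)]


# Hypothetical 3x3 gray scale image with gray levels 0..3.
image = [
    [0, 1, 3],
    [1, 2, 3],
    [3, 3, 3],
]
filtration = threshold_filtration(image, [1, 2, 3])
```

Because the inequality `value < r` can only become true as `r` grows, every black pixel of one binary image stays black in the next one, which is what makes the sequence a filtration.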

Second, a point cloud is a finite set S in some Euclidean space of dimension d. 
Given a threshold r, we deem any two points that lie within r from each other as 
"close". In this case, this pair of points is connected by an edge. Further, if three 
points are "close", pairwise, to each other, we add a face spanned by these points. 
If there are four, we add a tetrahedron, and, finally, any d+1 "close" points create 
a d-cell. The process is called the Vietoris-Rips construction. The result is a cell 
complex K for each r [6]. 
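The construction just described can be sketched by brute force. The code below is an illustration only: the function name, the use of "distance at most r" for closeness, and the sample point cloud are our assumptions, and the enumeration of all subsets is exponential, so it is suitable for tiny examples only.

```python
from itertools import combinations
from math import dist


def vietoris_rips(points, r, max_dim):
    """Cells of the Vietoris-Rips complex as tuples of point indices, by dimension."""
    n = len(points)
    # Pairs (i, j), i < j, of points that are "close".
    close = {(i, j) for i, j in combinations(range(n), 2)
             if dist(points[i], points[j]) <= r}
    cells = {0: [(i,) for i in range(n)]}
    for k in range(1, max_dim + 1):
        # A k-cell is spanned by k + 1 points that are pairwise close.
        cells[k] = [c for c in combinations(range(n), k + 1)
                    if all(pair in close for pair in combinations(c, 2))]
    return cells


# Hypothetical point cloud: the corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
cells = vietoris_rips(square, r=1.0, max_dim=2)
```

At r = 1 only the four sides of the square appear as edges and there are no triangles, so the complex is a combinatorial circle; raising r past the length of the diagonal adds both diagonals and all four triangles.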

Next, we would like to quantify the topology of the cell complex $K$. It is done
via the Betti numbers of $K$: $B_0$ is the number of connected components in $K$; $B_1$
is the number of holes or tunnels (1 for the letter O or the donut; 2 for the letter B and the
torus); $B_2$ is the number of voids or cavities (1 for both the sphere and the torus),
etc.

The Betti numbers are computed via homology theory [7]. One starts by considering
the collection $C_k(K)$ of all formal linear combinations (over a ring $R$) of
$k$-cells in $K$, called chains. Combined they form a finitely generated abelian group
called the chain group $C_k(K)$, or collectively the chain complex $C_*(K)$. A $k$-chain can be
recorded as an $N_k$-vector, where $N_k$ is the total number of $k$-cells in $K$. The boundary of
a $k$-chain is the chain comprised of all $(k-1)$-faces of its cells taken with appropriate
signs. Then the boundary operator $\partial : C_k(K) \to C_{k-1}(K)$ acts on the chain
complex and is represented by an $N_k \times N_{k-1}$ matrix.

From the chain complex $C_*(K)$, the homology group is constructed by means of
the standard algebraic tools. To capture the topological features one concentrates
on cycles, i.e., chains with zero boundary, $\partial A = 0$. Further, one can verify whether
two given $k$-cycles $A$ and $B$ are homologous: the difference between them is the
boundary of a $(k+1)$-chain $T$: $A - B = \partial T$ (such as two meridians of the torus). In
this case, $A$ and $B$ belong to the same homology class $H = [A] = [B]$. The totality
of these equivalence classes in each dimension $k$ is called the $k$-th homology group
$H_k(K)$ of $K$, collectively $H_*(K)$. Then the Betti number $B_k$ is the rank of $H_k(K)$.
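As an illustration, the Betti numbers can be read off the ranks of the boundary matrices: $B_k = N_k - \operatorname{rank} \partial_k - \operatorname{rank} \partial_{k+1}$. The sketch below assumes coefficients in the field Z/2 (so signs can be ignored; over Z the answer may differ when there is torsion) and stores $\partial_k$ as an $N_k \times N_{k-1}$ matrix with one row per $k$-cell. The function names and the example are ours.

```python
def rank_mod2(matrix):
    """Rank of a 0/1 matrix over Z/2 by Gaussian elimination."""
    rows = [list(r) for r in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def betti_numbers(n_cells, boundaries):
    """n_cells[k] = N_k; boundaries[k] = matrix of the boundary operator on k-chains, k >= 1."""
    top = len(n_cells) - 1
    ranks = {k: rank_mod2(boundaries[k]) for k in range(1, top + 1)}
    ranks[0] = 0          # the boundary of a 0-chain is zero
    ranks[top + 1] = 0    # there are no cells above the top dimension
    return [n_cells[k] - ranks[k] - ranks[k + 1] for k in range(top + 1)]


# Hollow triangle (a combinatorial circle): vertices v0, v1, v2; edges 01, 12, 02.
d1 = [
    [1, 1, 0],  # boundary of edge 01 is v0 + v1
    [0, 1, 1],  # boundary of edge 12 is v1 + v2
    [1, 0, 1],  # boundary of edge 02 is v0 + v2
]
circle = betti_numbers([3, 3], {1: d1})                       # [B_0, B_1]
disk = betti_numbers([3, 3, 1], {1: d1, 2: [[1, 1, 1]]})      # triangle filled in
```

The hollow triangle has one component and one hole; attaching the 2-cell fills the hole, so its first Betti number drops to zero.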

3. Prior work and outline 

The methods for computing homology groups are well developed. In real-life
applications, however, both digital images and point clouds may be noisy and one
needs to evaluate the significance of their homology classes. The approach to this
problem has been the following. Instead of using a single threshold and studying
a single cell complex, one considers all thresholds and all possible cell complexes.
Since increasing the threshold $r$ enlarges the corresponding complex, we have a sequence
of complexes:

$$K^1 \hookrightarrow K^2 \hookrightarrow K^3 \hookrightarrow \dots \hookrightarrow K^s,$$

where the arrows represent the inclusions $i^{n,n+1} : K^n \hookrightarrow K^{n+1}$. Let
$i^{n,m} : K^n \hookrightarrow K^m$, $n < m$, also be the inclusion. This structure $\{K^n, i^{n,m}\}$
is called a filtration.

Now, each of these inclusions generates a homomorphism $i_*^{n,m} : H_*(K^n) \to
H_*(K^m)$ called the homology map induced by $i^{n,m}$. As a result, we have a sequence
of homology groups connected by these homomorphisms:

$$H_*(K^1) \to H_*(K^2) \to \dots \to H_*(K^s).$$

These homomorphisms record how the homology changes as the complex grows at
each step. For example, a component appears and then merges with another one, 
or a hole is formed and then filled. We refer to these events as birth and death of 
the corresponding homology classes. 

In order to evaluate the robustness of an element of one of these groups, the
persistence of a homology class is defined as the number of steps it takes for the
class to end at 0. In other words,

persistence = death date - birth date.

The $p$-persistent homology group of $K^n$ is defined as the image of $i_*^{n,n+p}$. It is what is
left from $H_*(K^n)$ after $p$ steps in the filtration. Now the robustness of the homology
classes of the filtration is evaluated in terms of the set of intervals [birth, death]
representing the life-spans, called barcodes, of the homology classes [3].
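A small sketch shows how these quantities are read off a barcode. It assumes field coefficients and that each bar (birth, death) stands for a generator that first appears at step birth and first becomes 0 at step death; under these assumptions the rank of the image of $i_*^{n,n+p}$ is the number of bars that are alive at step $n$ and still nonzero at step $n+p$. The barcode itself is hypothetical.

```python
def persistence(birth, death):
    """Persistence of a class born at step `birth` that first becomes 0 at step `death`."""
    return death - birth


def persistent_rank(bars, n, p):
    """Number of bars alive at step n that are still nonzero at step n + p."""
    return sum(1 for birth, death in bars if birth <= n and death > n + p)


# Hypothetical barcode with three classes.
bars = [(1, 4), (1, 3), (2, 4)]
lifespans = [persistence(birth, death) for birth, death in bars]
```

Here `persistent_rank(bars, 1, 2)` counts only the first bar: the second class is already 0 at step 3 and the third is not yet born at step 1.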

Our approach is somewhat different. It consists of two steps. 

First step: we pool all possible homology classes in all elements of the filtration
together in a single algebraic structure (Sections 4 and 5). The presence of noise is
ignored. The homology group $H_*(\{K^n\})$ of the filtration $\{K^n\}$ captures all homology
classes in the whole filtration - without double counting.

Second step: for a given positive integer $p$, the $p$-noise group $N_*^p(\{K^n\})$ is
comprised of the homology classes in $H_*(\{K^n\})$ with persistence less than $p$.

Next, we "remove" the noise from the homology group of the filtration by using the
quotient (Sections 6 and 7):

$$H_*^p(\{K^n\}) = H_*(\{K^n\}) / N_*^p(\{K^n\}).$$

In other words: if the difference between two homology classes is deemed noise, they
are equivalent. The second step can be repeated as needed.

We also discuss the computational aspects of this approach (Section 8) and 
multiparameter filtrations (Section 9). 

Our approach provides a coarser classification of the homology of filtrations
than the one based on barcodes. The reason is that all homology classes with long
enough life-spans, i.e., high persistence, have equal place in the homology group
$H_*(\{K^n\})$ of the filtration regardless of the time of birth and death.

4. Motivation: the homology of a gray scale image 

In this section we will try to understand the meaning of the homology of the 
gray scale image in Figure 1. For simplicity we assume that there are only 2 levels 
of gray in addition to black and white. A visual inspection of the image suggests 
that it has three connected components each with a hole. Therefore, its 0- and
1-homology groups should have three generators each. We now develop an algebraic
procedure to arrive at this result.




Figure 1. A gray scale image and the corresponding filtration

First the image is "thresholded". The lower level sets of the gray scale function
of the image form a filtration: a sequence of three binary images, i.e. cell complexes:

$$K^1 \hookrightarrow K^2 \hookrightarrow K^3,$$

where the arrows represent the inclusions. Suppose $A_i, B_i, C_i$
are the homology classes that represent the components of $K^i$ and $a_i, b_i, c_i$ are the
holes, clockwise starting at the upper left corner. The homology groups of these
images also form sequences - one for each dimension 0 and 1.

Suppose $F_1, F_2$ are the two homology maps, i.e., homomorphisms of the homology
groups generated by the inclusions of the complexes, with $F_3 = 0$ included for
convenience. These homomorphisms act on the generators as follows:

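$$A_1 \to A_2 \to A_3 \to 0, \quad B_1 \to B_2 \to B_3 \to 0,$$

$$C_2 \to C_3 \to 0, \quad a_1 \to a_2 \to a_3 \to 0,$$

$$b_1 \to b_2 \to 0, \quad c_2 \to c_3 \to 0.$$

To avoid double counting, we want to count only the homology classes that don't
reappear in the next homology group. As it turns out, a more algebraically convenient
way to accomplish this is to count only the homology classes that go to 0
under these homomorphisms. These classes form the kernels of $F_1, F_2, F_3$. Now, we
choose the homology group of the original, gray scale image to be the direct sum
of these kernels:

$$H_0(\{K^i\}) = \ker F_1 \oplus \ker F_2 \oplus \ker F_3 = 0 \oplus 0 \oplus \langle A_3, B_3, C_3 \rangle,$$

$$H_1(\{K^i\}) = \ker F_1 \oplus \ker F_2 \oplus \ker F_3 = 0 \oplus \langle b_2 \rangle \oplus \langle a_3, c_3 \rangle.$$

Thus the image has three components and three holes, as expected.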

5. Homology groups of filtrations 

In the following sections we provide formal definitions. All cell complexes are 
finite. 

Suppose we have a one-parameter filtration:

$$K^1 \hookrightarrow K^2 \hookrightarrow K^3 \hookrightarrow \dots \hookrightarrow K^s.$$

Here $K^1, K^2, \dots$ are cell complexes and the arrows represent the inclusions
$i^{n,n+1} : K^n \hookrightarrow K^{n+1}$, and so do $i^{n,m} : K^n \hookrightarrow K^m$, $n < m$. We will denote the
filtration by $\{K^n, i^{n,m} : n, m = 1, 2, \dots, s, n < m\}$, or simply $\{K^n\}$. Next, homology
generates a "direct system" of groups and homomorphisms:

$$H_*(K^1) \to H_*(K^2) \to \dots \to H_*(K^s) \to 0.$$

We denote this direct system by $\{H_*(K^n), i_*^{n,m} : n, m = 1, 2, \dots, s, n < m\}$, or simply
$\{H_*(K^n)\}$. The zero is added in the end for convenience.

Our goal is to define a single structure that captures all homology classes in the
whole filtration without double counting. The rationale is that if $x \in H_*(K^n)$, $y \in
H_*(K^m)$, $y = i_*^{n,m}(x)$, and there is no other $x$ satisfying this condition, then $x$ and $y$
may be thought of as representing the same homology class of the geometric object
behind the filtration.

The homology group of the filtration $\{K^n\}$ is defined as the product of the kernels
of the homomorphisms induced by the inclusions:

$$H_*(\{K^n\}) = \ker i_*^{1,2} \oplus \ker i_*^{2,3} \oplus \dots \oplus \ker i_*^{s,s+1},$$

where $i_*^{s,s+1}$ is the zero map added at the end of the direct system.
Here, from each group we take only the elements that are about to die. Since each
dies only once, there is no double counting. Since the sequence ends with 0, we
know that everyone will die eventually. Hence every homology class appears once
and only once.

These are a few simple facts about this group. 

Proposition 5.1. If $i_*^{n,n+1}$ is an isomorphism for each $n = 1, 2, \dots, s-1$, then
$H_*(\{K^n\}) = H_*(K^1)$.

Proposition 5.2. If $i_*^{n,n+1}$ is a monomorphism for each $n = 1, 2, \dots, s-1$, then
$H_*(\{K^n\}) = H_*(K^s)$.

Proposition 5.3. Suppose $\{K^n, i^{n,m}, n, m = 1, 2, \dots, s\}$ and $\{L^n, j^{n,m}, n, m = 1, 2, \dots, s\}$
are filtrations. Then $H_*(\{K^n \cup L^n\}) = H_*(\{K^n\}) \oplus H_*(\{L^n\})$.

Proposition 5.4. Suppose $\{K^n, i^{n,m}, n, m = 1, 2, \dots, s\}$ and $\{L^n, j^{n,m}, n, m = 1, 2, \dots, s\}$
are filtrations and $f : K^s \to L^s$ is a cell map. Then the homology map of the
homology groups of these filtrations $f_* : H_*(\{K^n\}) \to H_*(\{L^n\})$ is well defined as

$$f_*(x_1, x_2, \dots, x_s) = (f_*^1(x_1), f_*^2(x_2), \dots, f_*^s(x_s)),$$

where $f^n$ is the restriction of $f$ to $K^n$.

The stability of the homology group of a filtration follows from the stability
of its persistence diagram, i.e., the set of points $\{(\text{birth}, \text{death})\} \subset \mathbb{R}^2$ for the
generators of the homology groups of the filtration, plus the diagonal. It is proven
in [5] that $d_B(D(f), D(g)) \le \|f - g\|_\infty$, where $d_B$ is the bottleneck distance
between the persistence diagrams $D(f)$, $D(g)$ of two filtrations generated by tame
functions $f, g$. The function $F(x, y) = y - x$ creates an analogous bottleneck distance
for the set of points $\{\text{persistence}\} \subset \mathbb{R}$ and its stability follows from the continuity
of $F$.
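The continuity of $F$ can be made quantitative. The estimate below is our own elementary remark, not a statement from [5]; it assumes that the points of the two diagrams are matched as in the definition of the bottleneck distance and that distances in the plane are measured in the sup norm.

```latex
\[
|F(x,y) - F(x',y')| = |(y - y') - (x - x')|
\le |y - y'| + |x - x'|
\le 2\,\|(x,y) - (x',y')\|_\infty .
\]
```

So if two matched points are within $\varepsilon$ of each other, their persistences differ by at most $2\varepsilon$. A point matched to the diagonal is compared with persistence $0$, since $F(x, x) = 0$.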

6. Motivation: the high contrast homology of a gray scale image 

To justify our approach to persistence, we observe that some of the features in 
the image in Figure 1 are more prominent than others. In particular, some of the 
features have lower contrast. These are the holes in the second and the third rings 
as well as the third ring itself. By contrast of a lower level set of the gray level 
function we understand the difference between the highest gray level adjacent to 
the set and the lowest gray level within the set. 

An easy computation shows that the homology classes with persistence of 3
or higher among the generators are: $A_1, B_1, a_1$. However, the set of the classes of
high persistence isn't a subgroup of the homology group of the respective complex.
Instead, we look at the classes with low persistence, i.e., the noise. In particular,
the classes in $H_*(K^1)$ of persistence 2 or lower form the kernel of $F_2 F_1$. We now
"remove" this noise from the homology groups of the filtration by considering their
quotients over these kernels. In particular, the 3-persistent homology groups of the
image are:

$$H_0^3(\{K^i\}) = \langle A_1, B_1 \rangle / 0 = \langle A_1, B_1 \rangle,$$

$$H_1^3(\{K^i\}) = \langle a_1, b_1 \rangle / \langle b_1 \rangle = \langle a_1 \rangle.$$

Observe that the output is identical to the homology of a single complex, i.e., a
binary image, with two components and one hole. The way persistence is defined
ensures that we can never remove a component as noise but keep a hole in it.

Observe now that the holes in the second and third rings have the same persistence
(contrast) and, therefore, occupy the same position in the homology group
regardless of their birth dates (gray level). Second, if we shrank one of these rings,
its persistence and, therefore, its place in the homology group wouldn't change.
These observations confirm the fact that the homology group of the gray scale
image, unlike the barcodes, captures only its topology.

In the case of a Vietoris-Rips complex, not only the barcode, [birth, death],
but also the persistence, death - birth, of a homology class contains information
about the size of representatives of these classes. For example, a set of points
arranged in a circle will produce a 1-cycle with birth, death, and
persistence twice as large as those of the same set shrunk by a factor of 2. However, persistence defined
as death/birth will have the desired property of scale independence. The same
result can be achieved by an appropriate re-parametrizing of the filtration.
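The scale independence can be checked directly. In the following computation the scaling factor $c > 0$ is our notation, and we assume birth $> 0$, which holds for 1-cycles in a Vietoris-Rips filtration; the logarithm is one possible choice of re-parametrization.

```latex
\[
c\cdot\text{death} - c\cdot\text{birth} = c\,(\text{death} - \text{birth}),
\qquad
\frac{c\cdot\text{death}}{c\cdot\text{birth}} = \frac{\text{death}}{\text{birth}},
\]
\[
\log(c\cdot\text{death}) - \log(c\cdot\text{birth})
= \log\text{death} - \log\text{birth}
= \log\frac{\text{death}}{\text{birth}} .
\]
```

Thus the difference scales with the point cloud while the ratio does not, and replacing the threshold $r$ with $\log r$ turns the ratio into a difference of the usual form.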



7. Persistent homology groups of filtrations 

In the general context of filtrations the measure of importance of a homology 
class is its persistence which is the length of its lifespan within the direct system of 
homology of the filtration. 

Given a filtration $\{K^n\}$, we say that the persistence $P(x)$ of $x \in H_*(K^n)$ is equal
to $p$ if $i_*^{n,n+p}(x) = 0$ and $i_*^{n,n+p-1}(x) \ne 0$. Our interest is in the "robust" homology
classes, i.e. the ones with high persistence. However, the collection of these classes
is not a group as it doesn't even contain 0. So we deal with "noise" first. Given a
positive integer $p$, the $p$-noise (homology) group $N_*^p(K^n)$ of $K^n$ is the group of
all elements of $H_*(K^n)$ with persistence less than $p$.

Alternatively, we can define these groups via kernels of the homomorphisms of
the inclusions: $N_*^p(K^n) = \ker i_*^{n,n+p}$.

Proposition 7.1. $N_*^{p}(K^n) \subset N_*^{p+1}(K^n)$.

Next, we "remove" the noise from the homology group. The $p$-persistent (homology)
group of $K^n$ with respect to the filtration $\{K^n\}$ is defined as

$$H_*^p(K^n) = H_*(K^n) / N_*^p(K^n).$$

The point of this definition is that, given a threshold for noise, if the difference
between two homology classes is noise, they should be equivalent.

Next, just as in the case of noise-less analysis, we define a single structure
to capture all (robust) homology classes. Let $p$ be a positive integer. Suppose
$x \in \ker i_*^{k,k+p}$ and let $y = i_*^{k,k+1}(x)$. Then

$$i_*^{k+1,k+1+p}(y) = i_*^{k,k+1+p}(x) = i_*^{k+p,k+p+1}(i_*^{k,k+p}(x)) = i_*^{k+p,k+p+1}(0) = 0.$$

Hence $y \in \ker i_*^{k+1,k+1+p}$. We have proved that

$$i_*^{k,k+1}(\ker i_*^{k,k+p}) \subset \ker i_*^{k+1,k+1+p}.$$

It follows that the homomorphism $i_*^{k,k+1} : \ker i_*^{k,k+p} \to \ker i_*^{k+1,k+1+p}$ generated by
the inclusion is well-defined.

Next, we use these homomorphisms to define the $p$-noise (homology) group
$N_*^p(\{K^n\})$ of the filtration $\{K^n\}$ as

$$N_*^p(\{K^n\}) = \ker i_*^{1,2} \oplus \dots \oplus \ker i_*^{s,s+1}.$$

Observe that the formula is the same as the one in the definition of $H_*(\{K^n\})$. Since
$i_*^{k,k+1} : \ker i_*^{k,k+p} \to \ker i_*^{k+1,k+1+p}$ is a restriction of $i_*^{k,k+1} : H_*(K^k) \to H_*(K^{k+1})$,
each term in the above definition is a subgroup of the corresponding term in the
definition of $H_*(\{K^n\})$. The proposition below follows.

Proposition 7.2. $N_*^p(\{K^n\}) \subset H_*(\{K^n\})$.

Finally, the $p$-persistent (homology) group of the filtration $\{K^n\}$ is

$$H_*^p(\{K^n\}) = H_*(\{K^n\}) / N_*^p(\{K^n\}).$$

The results about $H_*^p(\{K^n\})$ analogous to the ones about $H_*(\{K^n\})$ in Section
5 hold.

8. Computational aspects 

For 2-dimensional gray scale images, this approach to homology and persistence
has been used in an image analysis program. The algorithm described in [8] has
complexity of $O(n^2)$, where $n$ is the number of pixels in the image.

For the general case, the analysis algorithm may be outlined as follows: 

(1) The input is a filtration. 

(2) The homology groups of its members and the homomorphisms induced by 
inclusions are computed. 

(3) The homology group of the filtration is computed. 

(4) The persistence of all elements of the homology groups is computed. 

(5) The user sets a threshold p for persistence and the p-noise group of the 
filtration is computed. 

(6) The p-persistent homology group of the filtration is computed and given 
as output. 

If the user changes the threshold, the last step is repeated as necessary without 
repeating the rest. 
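Steps (4) and (5) can be sketched as follows. The sketch assumes that steps (1) and (2) have already produced the homology maps as matrices over Z/2 (one row per generator of the domain, the last map going to 0) and that we only track a chosen set of generators; the function names, the matrices, and the generator labels are hypothetical.

```python
def apply(x, matrix):
    """Image of the row vector x under the map given by `matrix`, over Z/2."""
    n_cols = len(matrix[0]) if matrix else 0
    return [sum(xi & row[j] for xi, row in zip(x, matrix)) % 2 for j in range(n_cols)]


def persistence_of(x, maps):
    """Number of steps it takes for x to become 0 under the consecutive maps."""
    steps = 0
    for m in maps:
        if not any(x):
            break
        x = apply(x, m)
        steps += 1
    return steps


def robust_classes(persistences, p):
    """Generators whose persistence is at least the threshold p."""
    return sorted(name for name, value in persistences.items() if value >= p)


# Hypothetical maps in dimension 1: both holes survive one step, then one is filled.
F1 = [[1, 0], [0, 1]]
F2 = [[1], [0]]
F3 = [[]]  # the zero added at the end
generators = {"a1": [1, 0], "b1": [0, 1]}

# Computed once ...
persistences = {name: persistence_of(x, [F1, F2, F3]) for name, x in generators.items()}
# ... and then re-thresholded as often as the user likes.
at_threshold_3 = robust_classes(persistences, 3)
at_threshold_2 = robust_classes(persistences, 2)
```

Only the last two lines depend on the threshold, which mirrors the remark that changing the threshold does not require repeating the earlier steps.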

The algorithm above computes the homology group of filtration, as defined, 
incrementally. This may be both a disadvantage and an advantage. In comparison, 
the persistence complex [3] also contains information about all homology classes of 
the filtration but its computation does not require computing the homology of each 
complex of the filtration. Meanwhile, the above algorithm may have to compute 
the same homology over and over if consecutive complexes are identical. Hence, 
the algorithm has a disadvantage in terms of processing time. On the other hand, 
the incremental nature of the algorithm makes its use of memory independent from 
the length of the filtration. Another advantage is that multi-parameter filtrations 
are dealt with in the exact same manner (see next section). 

The inefficiency of the above algorithm can be addressed with a proper algebraic
tool. This tool is the mapping cone [9]. Suppose, for simplicity, that our filtration
has only two elements: $i : K^1 \hookrightarrow K^2$. The mapping cone is, in a sense, a combination
of the kernel and the cokernel of $i_*$. It captures the difference between $K^1$ and $K^2$
on the chain level: everything in $C_*(K^1)$ is killed unless it also appears in $C_*(K^2)$
under $i_*$. Then the algorithm is to construct the homology group from the chain
complexes $C_*(K^1)$, $C_*(K^2)$ of the elements of the filtration and the chain map
$i_* : C_*(K^1) \to C_*(K^2)$.
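The text does not spell the construction out, so the following is only a sketch of the standard mapping cone from homological algebra [9], with Z/2 coefficients so that signs can be dropped. The cone has $k$-chains $C_{k-1}(K^1) \oplus C_k(K^2)$ and boundary $D(a, b) = (\partial a, i(a) + \partial b)$. The function names and the two-complex example are ours.

```python
def rank_mod2(matrix):
    """Rank of a 0/1 matrix over Z/2 by Gaussian elimination."""
    rows = [list(r) for r in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def cone_boundary(d_K1, inc, d_K2, n_left):
    """Matrix of D on C_{k-1}(K1) + C_k(K2); one row per generator.

    d_K1   : boundaries of the (k-1)-cells of K1 (n_left columns)
    inc    : images of the (k-1)-cells of K1 under the inclusion into K2
    d_K2   : boundaries of the k-cells of K2
    n_left : number of (k-2)-cells of K1
    """
    top = [list(rd) + list(ri) for rd, ri in zip(d_K1, inc)]
    bottom = [[0] * n_left + list(row) for row in d_K2]
    return top + bottom


# Hypothetical example: K1 = two vertices, K2 = the same vertices joined by an edge.
d_K1 = [[], []]           # vertices have zero boundary
inc = [[1, 0], [0, 1]]    # each vertex of K1 goes to the same vertex of K2
d_K2 = [[1, 1]]           # boundary of the edge is v0 + v1
D1 = cone_boundary(d_K1, inc, d_K2, n_left=0)

# Cone_0 has 2 generators, Cone_1 has 3, Cone_2 is trivial (so D2 = 0).
cone_betti_0 = 2 - rank_mod2(D1)
cone_betti_1 = len(D1) - rank_mod2(D1)
```

In this example the cone has a single homology class, in degree 1. It corresponds to the class $v_0 + v_1$ in $H_0(K^1)$ that dies when the edge is added, i.e., to the kernel of $i_*$, and it is found without computing the homology of $K^1$ and $K^2$ separately.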

9. Multiparameter filtrations

Multiparameter filtrations come from the same main sources as one-parameter
filtrations. First, color images are thresholded according to their three color channels.
Second, point clouds are thresholded by the closeness of their points and, for
example, the density of the points.

Let us limit our attention to the two-parameter case. A (finite) two-parameter
filtration $\{K^{n,m}\}$ is a table of complexes connected by inclusions

$$i(n, m, n+p, m+q) : K^{n,m} \hookrightarrow K^{n+p,m+q}, \quad p, q \ge 0.$$

These inclusions generate homomorphisms

$$i_*(n, m, n+p, m+q) : H_*(K^{n,m}) \to H_*(K^{n+p,m+q}),$$

with 0s added at the end of each row and each column. Define the homology group
of the filtration $\{K^{n,m}\}$ as

$$H_*(\{K^{n,m}\}) = \bigoplus_{n,m} \ker i_*(n, m, n+1, m) \cap \ker i_*(n, m, n, m+1).$$

The analogues of the results in Section 5 hold. 
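Each summand in this definition is an intersection of two kernels, which over a field can be computed as the kernel of the single map $x \mapsto (i_*(n,m,n+1,m)(x), i_*(n,m,n,m+1)(x))$. The sketch below assumes Z/2 coefficients and matrices with one row per generator of $H_*(K^{n,m})$; the function names and the sample matrices are hypothetical.

```python
def rank_mod2(matrix):
    """Rank of a 0/1 matrix over Z/2 by Gaussian elimination."""
    rows = [list(r) for r in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def joint_kernel_dim(horizontal, vertical):
    """dim(ker horizontal ∩ ker vertical) for two maps with the same domain."""
    # A class lies in both kernels exactly when it is killed by the combined map.
    combined = [list(rh) + list(rv) for rh, rv in zip(horizontal, vertical)]
    return len(combined) - rank_mod2(combined)


# Hypothetical maps out of a group with two generators x1, x2.
horizontal = [[0], [1]]    # x1 dies in the n-direction, x2 survives
vertical_a = [[0], [0]]    # both die in the m-direction
vertical_b = [[1], [0]]    # only x2 dies in the m-direction

dies_both_ways = joint_kernel_dim(horizontal, vertical_a)
dies_nowhere_jointly = joint_kernel_dim(horizontal, vertical_b)
```

In the first case only x1 is killed in both directions, so the summand has rank 1; in the second case no nonzero class is killed in both directions and the summand is trivial.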

There are many ways to define persistence in the multiparameter setting. For
example, we can evaluate the robustness of a homology class $x \in H_*(K^{n,m})$ in terms
of the pairs $(p, q)$ of positive integers satisfying

$$i_*(n, m, n+p, m)(x) = 0 \text{ and } i_*(n, m, n, m+q)(x) = 0.$$

Next, just as in Section 7, we restrict the homomorphisms generated by the inclusions
to the homology classes of low persistence:

$$i_*(n, m, n+1, m) : \ker i_*(n, m, n+p, m) \to \ker i_*(n+1, m, n+1+p, m),$$

$$i_*(n, m, n, m+1) : \ker i_*(n, m, n, m+q) \to \ker i_*(n, m+1, n, m+1+q).$$

Then the $(p, q)$-noise group of $\{K^{n,m}\}$ is defined via these homomorphisms:

$$N_*^{pq}(\{K^{n,m}\}) = \bigoplus_{n,m} \ker i_*(n, m, n+1, m) \cap \ker i_*(n, m, n, m+1).$$

Finally, the $(p, q)$-persistent (homology) group of the filtration $\{K^{n,m}\}$ is defined as

$$H_*^{pq}(\{K^{n,m}\}) = H_*(\{K^{n,m}\}) / N_*^{pq}(\{K^{n,m}\}).$$

The results about $H_*^{pq}(\{K^{n,m}\})$ analogous to the ones about $H_*^p(\{K^n\})$ in
Section 7 hold.

References 

[1] G. Bredon, Topology and Geometry, Springer Verlag, 1993.

[2] G. Carlsson, Topology and data, Bulletin of the Amer. Math. Soc., Vol. 46, No. 2, pp.
255-308, 2010.

[3] G. Carlsson and A. Zomorodian, Computing persistent homology, Discrete and
Computational Geometry, 2005; 20th ACM Symposium on Computational Geometry,
Brooklyn, NY, 2004.



[4] G. Carlsson and A. Zomorodian, The theory of multidimensional persistence, 23rd
ACM Symposium on Computational Geometry, Gyeongju, South Korea, 2007; Discrete
and Computational Geometry, 2009.

[5] D. Cohen-Steiner, H. Edelsbrunner, J. Harer, Stability of persistence diagrams,
Discrete and Computational Geometry, vol. 37, no. 1, pp. 103-120, 2007.

[6] H. Edelsbrunner, D. Letscher, and A. Zomorodian, Topological persistence and
simplification, Discrete Comput. Geom. 28 (2002), pp. 511-533.

[7] T. Kaczynski, K. Mischaikow, and M. Mrozek, Computational Homology, Appl. Math.
Sci. Vol. 157, Springer Verlag, NY, 2004.

[8] P. Saveliev, A graph, non-tree representation of the topology of a gray scale image,
Proceedings of SPIE, Algorithms and Systems, 2011, Volume 7870, 01-019.

[9] C. A. Weibel, An Introduction to Homological Algebra, Cambridge University Press,
1994.

Robustness of topology of digital images and point clouds 



Department of Mathematics, Marshall University, One John Marshall Drive,
Huntington, WV 25755



